
    Driving Context into Text-to-Text Privatization

    Metric Differential Privacy enables text-to-text privatization by adding calibrated noise to a word's vector in an embedding space and projecting the noisy vector back onto a discrete vocabulary via a nearest-neighbor search. Since words are substituted without context, the mechanism is expected to fall short for words with ambiguous meanings, such as 'bank'. To account for these ambiguous words, we leverage a sense embedding and incorporate a sense disambiguation step prior to noise injection. We accompany our modification to the privatization mechanism with an estimation of privacy and utility. For word sense disambiguation on the Words in Context dataset, we demonstrate a substantial increase in classification accuracy of 6.05%.
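    The mechanism described in this abstract (calibrated noise on a word vector, then projection back to the vocabulary by nearest-neighbor search) can be sketched as follows. The tiny vocabulary, random embeddings, and the Gamma-based noise sampler are illustrative assumptions, not the paper's actual setup:

    ```python
    import numpy as np

    # Toy embedding space; in practice these would be pretrained word vectors.
    rng = np.random.default_rng(0)
    vocab = ["bank", "river", "money", "shore", "loan"]
    emb = {w: rng.normal(size=8) for w in vocab}

    def sample_metric_dp_noise(dim, epsilon, rng):
        """Noise with density proportional to exp(-epsilon * ||z||):
        uniform direction on the unit sphere, Gamma-distributed magnitude."""
        direction = rng.normal(size=dim)
        direction /= np.linalg.norm(direction)
        magnitude = rng.gamma(shape=dim, scale=1.0 / epsilon)
        return direction * magnitude

    def privatize(word, epsilon, rng):
        """Add calibrated noise to the word's vector, then project back
        to the vocabulary via a nearest-neighbor search."""
        noisy = emb[word] + sample_metric_dp_noise(len(emb[word]), epsilon, rng)
        return min(vocab, key=lambda w: np.linalg.norm(emb[w] - noisy))

    # Lower epsilon means more noise, hence more distant (more private) substitutes.
    surrogate = privatize("bank", epsilon=10.0, rng=rng)
    ```

    The sense disambiguation step proposed in the paper would replace the single vector per word with one vector per word sense, selected from context before the noise is injected.
    
    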

    Guiding Text-to-Text Privatization by Syntax

    Metric Differential Privacy is a generalization of differential privacy tailored to the unique challenges of text-to-text privatization. By adding noise to the representation of words in the geometric space of an embedding, words are replaced with words located in the proximity of the noisy representation. Since embeddings are trained on word co-occurrences, this mechanism ensures that substitutions stem from a common semantic context. Without considering the grammatical category of words, however, it cannot guarantee that substitutions play similar syntactic roles. We analyze the capability of text-to-text privatization to preserve the grammatical category of words after substitution and find that surrogate texts consist almost exclusively of nouns. Since the mechanism lacks the capability to produce surrogate texts that mirror the structure of the sensitive texts, we extend it by transforming the privatization step into a candidate selection problem in which substitutions are directed to words with matching grammatical properties. We demonstrate a substantial improvement in the performance of downstream tasks of up to 4.66% while retaining comparable privacy guarantees.
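    A minimal sketch of the candidate-selection idea, restricting the nearest-neighbor search to words of the same grammatical category; the toy vocabulary, POS tags, and noise distribution are assumptions for illustration, not the paper's implementation:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # Toy embeddings tagged with grammatical categories (illustrative only).
    pos = {"run": "VERB", "walk": "VERB", "speed": "NOUN", "race": "NOUN"}
    emb = {w: rng.normal(size=8) for w in pos}

    def noise(dim, epsilon, rng):
        """Metric-DP-style noise: uniform direction, Gamma magnitude."""
        d = rng.normal(size=dim)
        d /= np.linalg.norm(d)
        return d * rng.gamma(shape=dim, scale=1.0 / epsilon)

    def privatize_with_syntax(word, epsilon, rng):
        """Noise injection as before, but the nearest-neighbor search is
        restricted to candidates sharing the word's grammatical category."""
        noisy = emb[word] + noise(8, epsilon, rng)
        candidates = [w for w in emb if pos[w] == pos[word]]
        return min(candidates, key=lambda w: np.linalg.norm(emb[w] - noisy))

    surrogate = privatize_with_syntax("run", epsilon=1.0, rng=rng)
    ```

    By construction, the substitute always plays the same syntactic role as the original word, which is exactly the property the unconstrained mechanism fails to guarantee.
    
    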

    Predictive Business Process Monitoring with Context Information from Documents

    Predictive business process monitoring deals with predicting a process’s future behavior or the value of process-related performance indicators based on process event data. Researchers have proposed a variety of prototypical predictive business process monitoring techniques to help process participants perform business processes better. In practical settings, however, these techniques have a low predictive quality that is often close to random, so predictive business process monitoring applications are rare in practice. The inclusion of process-context data has been discussed as a way to improve predictive quality, but existing approaches have considered only structured data as context. In this paper, we argue that process-related unstructured documents are also a promising source of process-context data. Accordingly, this research-in-progress paper outlines a design-science research process for creating a predictive business process monitoring technique that utilizes context data from process-related documents to predict a process instance’s next activity more accurately.

    A next click recommender system for web-based service analytics with context-aware LSTMs

    Software companies that offer web-based services instead of local installations can record users’ interactions with the system remotely. This data can be analyzed so that the service can subsequently be improved or extended. A recommender system that guides users through a business process by suggesting next clicks can improve user satisfaction, and hence service quality, and can reduce support costs. We present a technique for a next-click recommender system. Our approach is adapted from the predictive process monitoring domain and is based on long short-term memory (LSTM) neural networks. We compare three configurations of the LSTM technique: LSTM without context, LSTM with context, and LSTM with embedded context. The technique was evaluated on a real-life data set from a financial software provider, with a hidden Markov model (HMM) as the baseline. The configuration LSTM with embedded context achieved a significantly higher accuracy and the lowest standard deviation.
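    The "LSTM with embedded context" configuration can be illustrated with a minimal, untrained forward pass in which a learned context embedding is concatenated to each click embedding. The dimensions, weights, and concatenation scheme are illustrative assumptions, not the paper's architecture:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Illustrative sizes: 6 distinct clicks, 3 context values (e.g. user segment).
    n_clicks, n_ctx, emb_dim, hidden = 6, 3, 4, 5
    click_emb = rng.normal(scale=0.1, size=(n_clicks, emb_dim))  # learned in practice
    ctx_emb = rng.normal(scale=0.1, size=(n_ctx, emb_dim))       # embedded context

    in_dim = 2 * emb_dim  # click embedding concatenated with context embedding
    W = rng.normal(scale=0.1, size=(4 * hidden, in_dim))    # input weights (i, f, o, g)
    U = rng.normal(scale=0.1, size=(4 * hidden, hidden))    # recurrent weights
    b = np.zeros(4 * hidden)
    W_out = rng.normal(scale=0.1, size=(n_clicks, hidden))  # projection to click logits

    def lstm_step(x, h, c):
        """One LSTM step: input, forget, output gates and candidate cell state."""
        z = W @ x + U @ h + b
        i = sigmoid(z[:hidden])
        f = sigmoid(z[hidden:2 * hidden])
        o = sigmoid(z[2 * hidden:3 * hidden])
        g = np.tanh(z[3 * hidden:])
        c = f * c + i * g
        return o * np.tanh(c), c

    def predict_next(clicks, ctx):
        """Run a click sequence through the LSTM; return the most likely next click."""
        h, c = np.zeros(hidden), np.zeros(hidden)
        for click in clicks:
            x = np.concatenate([click_emb[click], ctx_emb[ctx]])
            h, c = lstm_step(x, h, c)
        return int(np.argmax(W_out @ h))

    next_click = predict_next([0, 2, 1], ctx=1)
    ```

    Training these weights end to end lets the model learn context-specific click embeddings, which is what distinguishes the embedded-context configuration from simply appending raw context features.
    
    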

    GAM(e) changer or not? An evaluation of interpretable machine learning models based on additive model constraints

    The number of information systems (IS) studies dealing with explainable artificial intelligence (XAI) is currently exploding as the field demands more transparency about the internal decision logic of machine learning (ML) models. However, most techniques subsumed under XAI provide post-hoc analytical explanations, which have to be treated with caution as they rely only on approximations of the underlying ML model. Our paper therefore investigates a series of intrinsically interpretable ML models and discusses their suitability for the IS community. More specifically, we focus on advanced extensions of generalized additive models (GAMs), in which predictors are modeled independently in a non-linear way to generate shape functions that can capture arbitrary patterns yet remain fully interpretable. In our study, we evaluate the prediction quality of five GAMs against six traditional ML models and assess their visual outputs for model interpretability. On this basis, we investigate their merits and limitations and derive design implications for further improvements.
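    The core idea of a GAM (one independently fitted non-linear shape function per predictor, summed into the prediction) can be sketched via backfitting with a simple binned-mean smoother. The synthetic data, bin count, and smoother choice are illustrative assumptions; the paper's GAM extensions use more sophisticated shape-function learners:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic data: y = f1(x1) + f2(x2) + noise, with known non-linear shapes.
    n, n_bins = 2000, 20
    X = rng.uniform(-2, 2, size=(n, 2))
    y = X[:, 0] ** 2 + np.sin(3 * X[:, 1]) + rng.normal(scale=0.1, size=n)
    edges = np.linspace(-2, 2, n_bins + 1)

    def fit_gam(X, y, n_iter=10):
        """Backfitting: each predictor gets its own shape function, here a
        piecewise-constant fit (binned means) to the partial residuals."""
        n_feat = X.shape[1]
        alpha = y.mean()
        shapes = [np.zeros(n_bins) for _ in range(n_feat)]
        bins = [np.clip(np.digitize(X[:, j], edges) - 1, 0, n_bins - 1)
                for j in range(n_feat)]
        for _ in range(n_iter):
            for j in range(n_feat):
                others = sum(shapes[k][bins[k]] for k in range(n_feat) if k != j)
                resid = y - alpha - others
                for idx in range(n_bins):
                    mask = bins[j] == idx
                    if mask.any():
                        shapes[j][idx] = resid[mask].mean()
                shapes[j] -= shapes[j].mean()  # center shapes for identifiability
        return alpha, shapes, bins

    alpha, shapes, bins = fit_gam(X, y)
    pred = alpha + sum(shapes[j][bins[j]] for j in range(2))
    ```

    Because each shape function depends on a single predictor, it can be plotted directly, which is the visual interpretability the abstract refers to.
    
    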